Just Relax and Come Clustering! A Convexification of k-Means Clustering, Report no. LiTH-ISY-R-2992

Authors

  • Fredrik Lindsten
  • Henrik Ohlsson
  • Lennart Ljung
Abstract

k-means clustering is a popular approach to clustering. It is easy to implement and intuitive but has the disadvantage of being sensitive to initialization due to an underlying non-convex optimization problem. In this paper, we derive an equivalent formulation of k-means clustering. The formulation takes the form of an ℓ0-regularized least squares problem. We then propose a novel convex, relaxed formulation of k-means clustering. The sum-of-norms regularized least squares formulation inherits many desired properties of k-means but has the advantage of being independent of initialization.
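The sensitivity to initialization mentioned in the abstract can be illustrated with a small sketch. It assumes the standard Lloyd iteration as the k-means heuristic (the abstract does not name a specific algorithm), and the toy data set and the helper names `squared_distance`, `assign` and `lloyd_kmeans` are hypothetical, chosen only for this illustration. Two different starting points converge to two different fixed points with very different costs.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def lloyd_kmeans(points, init_centers, max_iter=100):
    """Plain Lloyd iteration (illustrative sketch, not the paper's method)."""
    centers = [tuple(float(v) for v in c) for c in init_centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the center to the mean of its assigned points.
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(old)
        if new_centers == centers:
            break  # fixed point reached
        centers = new_centers
    labels = assign(points, centers)
    cost = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, cost


# Hypothetical toy data: corners of a wide rectangle, to be split into k = 2 clusters.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start near the left and right pairs: converges to the left/right split, cost 1.0.
good_centers, good_labels, good_cost = lloyd_kmeans(points, [(0, 0), (4, 0)])

# Start between the pairs: stuck at the top/bottom split, cost 16.0.
bad_centers, bad_labels, bad_cost = lloyd_kmeans(points, [(2, 0), (2, 1)])

print(good_centers, good_cost)
print(bad_centers, bad_cost)
```

Both runs stop at a point where neither the assignment step nor the mean-update step changes anything, yet the second cost is sixteen times the first. A convex formulation, such as the one the abstract proposes, does not depend on a starting point in this way.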

Similar resources

A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS

Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used in many fields, research continues into finding the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of the cluster centers and can therefore become trapped in a local optimum. In this paper...

Clustering using sum-of-norms regularization; with application to particle filter output computation, Report no. LiTH-ISY-R-2993

We present a novel clustering method, SON clustering, formulated as a convex optimization problem. The method is based on over-parameterization and uses a sum-of-norms regularization to control the trade-off between the model fit and the number of clusters. Hence, the number of clusters can be automatically adapted to best describe the data, and need not be specified a priori. We apply SON cluste...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data, and the need to process it and extract useful information and meaningful patterns, have attracted many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar, in some sense or another, to each other than to those in other groups. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K...

Publication date: 2011